Abstract

A variety of selection-mutation models for DNA (or RNA) sequences, well 
known in molecular evolution, can be translated into a model of coupled Ising 
quantum chains. This correspondence is used to investigate the genetic variability
and error threshold behaviour as a function of the fitness landscape. In contrast
to the two-state models treated hitherto, the model explicitly takes the four-state
nature of the nucleotide alphabet into account and allows for the distinction of
mutation rates for the different base substitutions, as given by standard mutation
schemes of molecular phylogeny.
As a consequence of this refined treatment, new phase diagrams for the error 
threshold behaviour are obtained, with appearance of a novel phase in which 
the nucleotide ordering of the wildtype sequence is only partially conserved. 
Explicit analytic and numeric results are presented for evolution dynamics and 
equilibrium behaviour in a number of accessible situations, such as quadratic 
fitness landscapes and the Kimura 2 parameter mutation scheme. 

1 Introduction 

One prominent phenomenon in the theory of molecular evolution that has also 
attracted considerable attention in statistical physics is the so-called error thresh- 
old. It describes the breakdown of genetic order in mutation-selection models for 
mutation rates surpassing a certain critical value. The prototype model for the de- 
scription of the error threshold is Eigen's quasispecies model in sequence space |7[ || 
(which is effectively equivalent to a coupled mutation-selection model in population 
genetics, cf jsj), originally designed for the description of prebiotic RNA evolution. 
However, the threshold is supposed to be a phenomenon that should occur in a 
rather general class of mutation-selection models. 

In order to set up a mutation-selection model that is tractable by analytical
(or at least numerical) methods, severe simplifications of the original biological
situation seem to be indispensable. Analytical approaches are generally restricted
to infinitely large populations and rather simple fitness functions, such as the
sharply peaked landscape of Eigen's original model. Another common approximation,
also used in previous studies of the quasispecies model, is the simplified
representation of genotypes as binary strings. In the context of molecular
evolutionary theory, this may be thought of as representing DNA or RNA strands by
sequences of purines and pyrimidines, hence with only two states per site,
neglecting the fact that genetic information is really given by a four-letter
alphabet. In this article, we present a four-state mutation-selection model which
is capable of describing the full nucleotide alphabet and incorporates the standard
mutation schemes of molecular phylogeny. In particular, we discuss in detail the
phase diagrams, which are richer than those of the two-state model. This shows
that, for a full understanding of the error threshold behaviour in molecular
evolution, investigations cannot be restricted entirely to the study of two-state
models.

One important step towards an understanding of the threshold phenomenon has 
been its identification with an equilibrium phase transition in physics by the trans- 
lation of a time-discrete version of the quasispecies model into the transfer matrix 
of an anisotropic two-dimensional Ising model [15]. This equivalence was further
exploited to study various aspects of the error threshold with methods from
statistical physics |l^, 18 . It turns out, however, that the anisotropy of that
model is not so easy to handle and the analysis of the relevant biological quantities
(which correspond to certain surface properties of the Ising model) remains an in- 
volved problem. Due to the complications of the model, almost all results obtained 
so far are approximate or numerical. The only exact result for the sharply peaked 
landscape has been worked out via a different analogy to a model of directed 
polymers, using the specific properties of that very special fitness landscape. 

An alternative approach to the analysis of mutation-selection models and the 
error threshold which avoids some of the problems of the anisotropic Ising model 
has been brought up in |§, 0. Here, the starting point on the biological side is 
a slightly changed model which describes the evolution of a population with over- 
lapping generations in continuous time. It turns out that, after a reformulation in 
tensor products, the two-state version of this model is equivalent to the Hamiltonian 
of an Ising quantum chain. Thereby, the change to continuous time in the biological 
description corresponds to the anisotropic limit that connects the two-dimensional 
Ising model and the quantum chain in physics (cf. @). The quantum chain model 
is technically easier to handle, and exact results for two non-trivial fitness land- 
scapes, namely Onsager's landscape and the quadratic fitness function, have been 
worked out ||, H. 

Accordingly, we extend this latter approach to a full four-state model in this
study. The quantum chain analogy allows us to use well-known methods from
statistical mechanics for the solution of the model, so that we do not have to dwell on
technical details here. For an extended presentation of methods (with regard to 
the two-state model) using techniques from rigorous mean field theory, we refer to 
(2^, ^ . The main focus is instead on the discussion of the threshold behaviour and 
in particular the increased complexity of the phase diagram due to the considera- 
tion of the four-state nature of biological information and the refined schemes of 
molecular mutation rates. 

In the following section, we start with a presentation of the biological foundations
of our model. We then introduce the quantum chain model in Section 3. In Section 4,
analytical and numerical results are presented for a number of specific four-state
models with permutation invariant fitness landscapes. The properties of finite
sequences and the evolution dynamics are also studied. We close with a summary of
our results and a discussion of open problems in Section 5.

2 Biological foundations



Genetic information is coded in DNA (and RNA) molecules. These are heteropoly- 
mers of four units (nucleotides) which differ in a specific base. The essential aspect 
of a DNA sequence is captured in a string over a four-letter alphabet 

\[ \sigma \in \mathcal{V} = V_1 \times V_2 \times \cdots \times V_N ; \qquad V_i = \{A, C, G, T\} \tag{1} \]

where each letter represents a particular base: A and G for adenine and guanine (the
purines), C and T for cytosine and thymine (the pyrimidines). In RNA sequences,
T is replaced by U for uracil. We will therefore treat the $4^N$ different sequences
of a fixed, finite length $N$ as our genotypes (which may be thought of as coding
for something, such as a virus or an enzyme). Disregarding environmental effects,
we may identify a collection of genotypes with a population of haploid 'individuals'.
Evolution then describes the change of the population composition in time.

A standard model for the evolution of an infinite, asexually reproducing popu- 
lation under the basic forces of mutation and selection which works in continuous 
time is given by the following system of non-linear differential equations Q 

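\[ \dot{p}_\sigma(t) = \bigl( r_\sigma - \bar{r}(t) \bigr)\, p_\sigma(t) + \sum_{\sigma'} m_{\sigma\sigma'}\, p_{\sigma'}(t) \tag{2} \]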

Here, $p_\sigma(t)$ denotes the relative frequency of genotype $\sigma$ at time $t$ with
corresponding Malthusian fitness (replication rate minus death rate) $r_\sigma$, and

\[ \bar{r}(t) = \sum_{\sigma} r_\sigma\, p_\sigma(t) \tag{3} \]

is the mean fitness of the population. It is the origin of the non-linearity in (2).
Finally, $m_{\sigma\sigma'}$ is the (time independent) rate at which $\sigma'$ mutates
to $\sigma$. This framework was originally defined in classical population genetics j^.
In the sequence space context, it has been introduced in and has been called the
paramuse (parallel mutation-selection) model, since it assumes mutation and selection
to act independently and in parallel at each instant of time. The model ignores
recombination and genetic drift due to finite population size. Both assumptions
can be considered fairly reasonable, at least in the context of the evolution of
viruses or bacteria where populations can be huge and recombination is absent, or
the nucleotides are tightly linked. In the following subsections, the basic processes
of mutation and selection are described in some detail.



2.1 Mutation 

We take mutation as a point process acting independently on all sites, ignoring 
more complicated mechanisms, such as insertions or deletions. Molecular mutation 
rates shall be chosen according to the following scheme, known as the Kimura 3 ST 
model in molecular phylogeny JTz] , p3[ : 

Within this general setup, a number of simpler models are contained, which treat
mutation at different levels of sophistication. In the simplest approach, the mutation
rates between all four nucleotides are assumed to be equal ($\mu_1 = \mu_2 = \mu_3$).
This is the so-called Jukes-Cantor mutation scheme. While this simple frame already
seems to be sufficient for a number of applications, measurements reveal that there
are indeed pronounced differences in the mutation rates that should be accounted for
in more realistic models. In particular, the transitions between the two purines (A,G)
and between the two pyrimidines (C,T) are much more frequent than the
purine-pyrimidine mutations, which are called transversions. This may range up to
relative differences of $\mu_1 \approx \mu_3 \simeq \mu_2/2$ in the nucleus and
$\mu_1 \approx \mu_3 \simeq \mu_2/40$ in mitochondrial DNA [17].

Figure 1: Molecular mutation scheme according to the Kimura 3 ST model.



A mutation scheme with $\mu_2 > \mu_1 = \mu_3$ is known as the Kimura 2 parameter model.
The full Kimura 3 ST scheme, finally, also accounts for the small difference between
$\mu_1$ and $\mu_3$, such that $\mu_2 > \mu_1 > \mu_3$.

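Implementing this mutation model into the evolution equation (2), we obtain
the following mutation rates between genotypes ($i \in \{1, 2, 3\}$)

\[ m_{\sigma\sigma'} = \begin{cases} \mu_i , & d_i(\sigma,\sigma') = d_{\sigma\sigma'} = 1 \\ -N \sum_{i} \mu_i , & \sigma = \sigma' \\ 0 , & d_{\sigma\sigma'} > 1 \end{cases} \tag{4} \]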

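Here,

\[ \begin{aligned} d_1(\sigma,\sigma') &= \#_{A \leftrightarrow C}(\sigma,\sigma') + \#_{G \leftrightarrow T}(\sigma,\sigma') \\ d_2(\sigma,\sigma') &= \#_{A \leftrightarrow G}(\sigma,\sigma') + \#_{C \leftrightarrow T}(\sigma,\sigma') \\ d_3(\sigma,\sigma') &= \#_{A \leftrightarrow T}(\sigma,\sigma') + \#_{C \leftrightarrow G}(\sigma,\sigma') \end{aligned} \tag{5} \]

are restricted Hamming distances between $\sigma$ and $\sigma'$. In (5),
$\#_{X \leftrightarrow Y}(\sigma,\sigma')$ counts the positions at which X and Y are
exchanged in $\sigma$ and $\sigma'$. Finally,

\[ d_{\sigma\sigma'} = d_1(\sigma,\sigma') + d_2(\sigma,\sigma') + d_3(\sigma,\sigma') \tag{6} \]

is the total Hamming distance. Note that the choice of the diagonal term
$m_{\sigma\sigma}$ in (4) just accounts for probability conservation
($\sum_{\sigma} m_{\sigma\sigma'} = 0$) in the mutation part of the evolution
equation (2).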
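
The definitions (4) to (6) translate directly into a few lines of code. The following is a minimal sketch, not taken from the original work: the names `restricted_hamming` and `mutation_rate` and the numerical rates are illustrative assumptions. It computes the restricted Hamming distances of two sequences, the resulting genotype mutation rate, and the column sums of the rate matrix for all sequences of length 2.

```python
from itertools import product

# Substitution classes of the Kimura 3ST scheme as used in Eq. (5):
# class 1: A<->C, G<->T; class 2: A<->G, C<->T (transitions); class 3: A<->T, C<->G.
SUBSTITUTION_CLASS = {
    frozenset("AC"): 0, frozenset("GT"): 0,
    frozenset("AG"): 1, frozenset("CT"): 1,
    frozenset("AT"): 2, frozenset("CG"): 2,
}


def restricted_hamming(seq_a, seq_b):
    """Return the restricted Hamming distances (d1, d2, d3) of two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    d = [0, 0, 0]
    for x, y in zip(seq_a, seq_b):
        if x != y:
            d[SUBSTITUTION_CLASS[frozenset((x, y))]] += 1
    return tuple(d)


def mutation_rate(target, source, mu):
    """Rate at which `source` mutates to `target`; mu = (mu1, mu2, mu3)."""
    d = restricted_hamming(target, source)
    if sum(d) == 0:
        return -len(source) * sum(mu)   # diagonal term: total outflow
    if sum(d) == 1:
        return mu[d.index(1)]           # single point mutation of class i
    return 0.0                          # more than one site differs


# Probability conservation: every column of the rate matrix sums to zero.
mu = (0.1, 0.4, 0.05)                   # illustrative values, not from the text
genotypes = ["".join(s) for s in product("ACGT", repeat=2)]
column_sums = [sum(mutation_rate(t, s, mu) for t in genotypes) for s in genotypes]
```

The vanishing column sums are the probability conservation property noted for the diagonal term in (4).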



2.2 Selection and fitness landscape 

Whereas the mutational part of the dynamics is fairly well understood at least on the 
microscopic (molecular) level, the relation of genotype and fitness, which defines the 
respective selective success, is notoriously complex. Following the standard notion 
in molecular evolution, we define the fitness function (or fitness landscape) 

\[ f : \sigma \mapsto r_\sigma \tag{7} \]

as a mapping from the configuration space $\mathcal{V} = \{A, C, G, T\}^N$ into the real
numbers, assigning a reproduction rate (Malthusian fitness value) $r_\sigma$ to each genotype.
Implicitly, the fitness function incorporates all the complicated interactions between 
the sites. These interactions are typically long-ranged (since RNA strands or pro- 
teins fold in three dimensions), highly correlated, and give rise to rather rugged 
landscapes. Especially in the context of RNA evolution, the construction and char- 
acterization of fitness landscapes has motivated numerous studies, see e.g. ||2^ for 
a review. 

Figure 2: Permutation invariant configuration space of the four-state model in
surplus coordinates.

Below we will show how the evolution equation (2), with an arbitrary choice of
the fitness function, can be adapted to the methods from statistical physics by a
reformulation in a quantum chain framework. As an application, we then present
a study (including analytical and numerical results) for specific examples from the
class of permutation invariant fitness functions. Here, due to equivalence of all sites,
the fitness of a given genotype is solely a function of its restricted Hamming
distances from the so-called wildtype sequence with optimal fitness, which we choose as
the reference genotype. This particularly simple class of fitness landscapes is widely
used, as a canonical first approximation, especially in multilocus theory. Also in the
context of sequence space evolution, fitness functions of this type have been used
in a number of studies on the two-state model |l6|, ^ ||, |2^. To implement
the approach in our four-state model, we fix an arbitrary sequence, denoted by
$\sigma_{++}$, as the wildtype. We will only consider directional selection here towards a
unique genotype with optimal fitness. The fitness of any other sequence is then
determined by the restricted Hamming distances $d_i$ relative to $\sigma_{++}$. Permutation
invariance with respect to the position in the sequence thus leads to a drastic
reduction of dimensions. For the four-state model, the effective configuration space
forms a tetrahedron in 3d (see Fig. 2) and is conveniently represented in Cartesian
coordinates which we shall call (following ) the surplus components:

\[ s_1 = \frac{1}{N}\bigl(N - 2(d_1 + d_3)\bigr), \qquad s_2 = \frac{1}{N}\bigl(N - 2(d_2 + d_3)\bigr), \qquad s_3 = \frac{1}{N}\bigl(N - 2(d_1 + d_2)\bigr), \qquad d_i \equiv d_i(\sigma, \sigma_{++}) \tag{8} \]



With this choice, any unstructured random sequence has coordinates $s = 0$ (with
probability 1 in the limit $N \to \infty$). Any positive value of a surplus component,
on the other hand, signals a non-trivial overlap of the sequence with the wildtype
$\sigma_{++}$. In particular, $s_1$ measures the surplus of sites with purines or
pyrimidines as given in $\sigma_{++}$ over the purine-pyrimidine mutated sites.
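
As an illustration of the surplus coordinates, here is a minimal sketch. It assumes the form $s_1 = 1 - 2(d_1 + d_3)/N$, $s_2 = 1 - 2(d_2 + d_3)/N$, $s_3 = 1 - 2(d_1 + d_2)/N$, which is consistent with the description of $s_1$ above and with the mutation count in (27); the function names are hypothetical.

```python
# Substitution classes relative to the wildtype letter (cf. Eq. (5)):
# class 1: A<->C, G<->T; class 2: A<->G, C<->T; class 3: A<->T, C<->G.
CLASS_OF_PAIR = {
    frozenset("AC"): 0, frozenset("GT"): 0,
    frozenset("AG"): 1, frozenset("CT"): 1,
    frozenset("AT"): 2, frozenset("CG"): 2,
}


def surplus(sequence, wildtype):
    """Surplus components (s1, s2, s3) of `sequence` relative to `wildtype`."""
    n = len(wildtype)
    if n == 0 or len(sequence) != n:
        raise ValueError("sequences must be non-empty and of equal length")
    d = [0, 0, 0]
    for x, y in zip(sequence, wildtype):
        if x != y:
            d[CLASS_OF_PAIR[frozenset((x, y))]] += 1
    d1, d2, d3 = d
    return (1 - 2 * (d1 + d3) / n,   # s1: unaffected by transitions (class 2)
            1 - 2 * (d2 + d3) / n,   # s2: unaffected by class 1 mutations
            1 - 2 * (d1 + d2) / n)   # s3: unaffected by class 3 mutations


def mutations_per_site(s):
    """Fraction of mutated sites, (3 - (s1 + s2 + s3)) / 4."""
    return (3 - sum(s)) / 4
```

For example, with wildtype `AAAA` the sequence `ACGT` carries one mutation of each class and sits at the origin of the tetrahedron, while the wildtype itself sits at the corner (1, 1, 1).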

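Within this frame, a natural class of permutation invariant fitness functions is

\[ r_\sigma = f(s_1, s_2, s_3) = N \sum_{i=1}^{3} \Bigl( \alpha_i s_i + \frac{\gamma_i}{2}\, s_i^2 \Bigr) \tag{9} \]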



which includes the following special cases 

• Setting $\alpha_i > 0$ and $\gamma_i = 0$, we obtain the purely additive Fujiyama
landscape without genetic interactions. Here, every mutation relative to the wildtype
has a fixed deleterious effect, independent of any other mutation that may be
present in the sequence. The additive landscape is a canonical zeroth-order
approximation, ignoring any kind of genetic interactions. In the context of
sequence evolution, this fitness function has been discussed e.g. in |^.

• With the choice $\alpha_i > -\gamma_i > 0$, the model corresponds to a concave quadratic
fitness function (with directional selection) as it is frequently met in multilocus
theory. Due to the gene interactions, existing mutations tend to aggravate
further ones, which is called positive epistasis.

• For $\alpha_i \ge 0$ and $\gamma_i > 0$, we finally obtain a convex fitness function for
directional selection with long-range gene interactions and negative epistasis
(existing mutations tend to alleviate further ones). Since we want to have
$\sigma_{++}$ as unique wildtype sequence and a fitness function which is monotonic
in the surplus components, we restrict $f$ to the octant $s_i \ge 0$ and (smoothly)
truncate the fitness function by introduction of a step function $\Theta(s_i)$ whenever
frequencies of genotypes with $s_i < 0$ are non-zero:

\[ \tilde{f}(s_1, s_2, s_3) = N \sum_{i=1}^{3} \Bigl( \alpha_i s_i + \frac{\gamma_i}{2}\, s_i^2 \Bigr)\, \Theta(s_i) \tag{10} \]

The variables $\alpha_i$ and $\gamma_i$ may further be used to distinguish between the
effects of the different types of mutations (as defined in Fig. 1) on the fitness. In
this article, we will present explicit results for the two following cases:

1. For the simplest choice, $\alpha_1 = \alpha_2 = \alpha_3$ and $\gamma_1 = \gamma_2 = \gamma_3$,
any mutation away from the wildtype has the same effect. Together with the Jukes-Cantor
mutation scheme, symmetry here leads to equal values of the surplus components
in the mutation-selection equilibrium. The model may thus also be thought
of as a two-state model, where any site is only regarded as occupied either
with a wildtype or with a mutant nucleotide. In contrast to the simple
two-state model of Q|, however, there is an effectively asymmetric mutation rate
between wildtype and mutant in the case considered here.

2. In a more refined model, we distinguish between transitions and transversions.
In the mutational part, this is done by applying the Kimura 2 parameter
mutation scheme. In the fitness function, we take into account that the
deleterious effects of the transversions often dominate over those of the transitions:
$\alpha_1 > \alpha_{2,3}$ and/or $\gamma_1 > \gamma_{2,3}$.



3 Quantum chain model 

3.1 Symmetries 

Since mutation is a random process that is independent of the fitness values of the
genotypes involved, the molecular mutation scheme makes no reference to fitness
concepts like the wildtype. Biological observables measurable from sequence data,
such as the surplus components (8), and also the fitness functions as defined in (9)
or (10), on the other hand, are defined relative to the wildtype sequence. In order
to set up these concepts in a common framework, it is convenient to reformulate also
the mutational part of the evolution equation in coordinates relative to the
wildtype. This may always be done due to certain symmetries inherent
in the mutation scheme of Fig. 1.

The basic symmetry of the mutation scheme, if all three mutation rates $\mu_1$, $\mu_2$,
$\mu_3$ are pairwise different, is $C_2 \times C_2$ (Klein's 4-group), generated by two
involutions. If we write the operations in standard permutation notation, we can take
as generators the transformations

\[ \begin{pmatrix} A & C & G & T \\ C & A & T & G \end{pmatrix} , \qquad \begin{pmatrix} A & C & G & T \\ G & T & A & C \end{pmatrix} \tag{11} \]

both being the product of two transpositions. This symmetry may now be exploited
for a redefinition of the mutation scheme in wildtype coordinates. To this end, we
fix, for every site of the wildtype sequence, the element of the 4-group (in the above
representation) with the letter of the wildtype nucleotide in the first position (e.g.
the string (T,G,C,A) for wildtype nucleotide T). An alternative representation of
the configuration space in wildtype coordinates as

\[ \sigma \in \mathcal{V}_\pm = V_{\pm 1} \times V_{\pm 2} \times \cdots \times V_{\pm N} ; \qquad V_{\pm i} = \{++, -+, +-, --\} \tag{12} \]

is now given by the mapping, on each site, of the string of labels ($++$, $-+$, $+-$, $--$)
to the symmetry element of the 4-group defined above. With this notation, the three
types of mutations included in the Kimura 3 ST scheme simply switch the signs of
the labels: $\pm\pm \leftrightarrow \mp\pm$ at rate $\mu_1$, $\pm\pm \leftrightarrow \pm\mp$
at rate $\mu_2$, and $\pm\pm \leftrightarrow \mp\mp$ at rate $\mu_3$.

Higher symmetries of the mutation model are obtained if mutation rates are
equal. For the Kimura 2 parameter scheme, $\mu_1 = \mu_3 \neq \mu_2$, the operation

\[ A \to C \to G \to T \to A \;\; \hat{=} \;\; \begin{pmatrix} A & C & G & T \\ C & G & T & A \end{pmatrix} \tag{13} \]

is also a symmetry and generates a cyclic group $C_4$. Together with the previous
$C_2 \times C_2$, this generates a dihedral group, $D_4$, with 8 elements. Finally, if
$\mu_1 = \mu_2 = \mu_3$, we additionally get the simple transposition $A \leftrightarrow C$
and have the full permutation group $S_4$ as symmetry. Note that $S_4$, which corresponds
to the full tetrahedral group with 24 elements, is also the symmetry group of the
configuration space of permutation invariant configurations visualized in Fig. 2. The
global symmetry (with the same transformation acting at each site simultaneously) of
our class of mutation-selection models with fitness functions according to (9) is
therefore always a subgroup of $S_4$. In particular, the symmetric fitness model with
$\alpha_1 = \alpha_2 = \alpha_3$, $\gamma_1 = \gamma_2 = \gamma_3$, and Jukes-Cantor mutation
scheme possesses $C_{3v}$ symmetry, or the full tetrahedral symmetry if the linear part
in the fitness function vanishes ($\alpha_i = 0$). The transition-transversion model
finally, with $\alpha_1 > \alpha_2 = \alpha_3$, or $\gamma_1 > \gamma_2 = \gamma_3$, and
Kimura 2 parameter mutation has simple $C_2$ symmetry, or $D_4$ symmetry if $\alpha_i = 0$.
In the latter case, the combination of $\gamma_2 = \gamma_3$ with $\mu_1 = \mu_3$ is
necessary, not a misprint. Other combinations with global $D_4$ symmetry are
($\gamma_1 = \gamma_3$; $\mu_2 = \mu_3$) and ($\gamma_1 = \gamma_2$; $\mu_1 = \mu_2$).
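
The orders of the symmetry groups of the single-site mutation scheme can be checked by brute force over all 24 permutations of the bases. The sketch below is illustrative only: it assumes the base order A, C, G, T with $\mu_1$ on A-C and G-T, $\mu_2$ on A-G and C-T, and $\mu_3$ on A-T and C-G, and the helper names are hypothetical.

```python
import numpy as np
from itertools import permutations


def rate_matrix(mu1, mu2, mu3):
    """Single-site Kimura 3ST rate matrix in the base order A, C, G, T."""
    off = np.array([[0, mu1, mu2, mu3],
                    [mu1, 0, mu3, mu2],
                    [mu2, mu3, 0, mu1],
                    [mu3, mu2, mu1, 0]], dtype=float)
    return off - np.diag(off.sum(axis=0))


def is_symmetry(perm, m):
    """True if relabelling the bases by `perm` leaves the rate matrix unchanged."""
    p = np.eye(4)[list(perm)]          # permutation matrix with p[i, perm[i]] = 1
    return bool(np.allclose(p @ m @ p.T, m))


def symmetry_group_order(mu1, mu2, mu3):
    """Number of base permutations that leave the mutation scheme invariant."""
    m = rate_matrix(mu1, mu2, mu3)
    return sum(int(is_symmetry(perm, m)) for perm in permutations(range(4)))


order_3st = symmetry_group_order(0.1, 0.4, 0.05)      # all rates different
order_2p = symmetry_group_order(0.1, 0.4, 0.1)        # mu1 = mu3 != mu2
order_jc = symmetry_group_order(0.2, 0.2, 0.2)        # Jukes-Cantor
```

Under these assumptions the three counts correspond to $C_2 \times C_2$, $D_4$ and $S_4$, respectively.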




3.2 Construction 

With the above preparations, we may now follow the lines of |2^ where the 
two-state model is treated. 

In a first step, we represent the $4^N$-dimensional vector space in which we describe
the genotype frequencies as the $N$-fold tensor product space
$W = \bigotimes_{j=1}^{N} W_j$. Hereby, the configuration space $\mathcal{V}_\pm$ is
canonically embedded in $W$ by the mapping of the elements of $V_{\pm j}$ onto the basis
vectors $\{e_{++}, e_{-+}, e_{+-}, e_{--}\}$ of $W_j \simeq \mathbb{R}^4$. Since the
nonlinear part in the differential equations (2) only amounts to normalization of the
frequencies, a transformation to so-called absolute frequencies [ p5| , ^

\[ z_\sigma(t) := p_\sigma(t) \exp\Bigl( \sum_{\sigma'} r_{\sigma'} \int_0^t p_{\sigma'}(\tau)\, d\tau \Bigr) \tag{14} \]

then reduces the system to the linear equation

\[ \dot{z}(t) = (\mathcal{M} + \mathcal{R})\, z(t) \tag{15} \]

where the mutation and reproduction matrices, $\mathcal{M} = (m_{\sigma\sigma'})$ and
$\mathcal{R} = \mathrm{diag}(r_\sigma)$, may now be conveniently represented in the
frequency space $W$. Defining



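\[ \sigma_j^{(a,b)} := \Bigl( \bigotimes_{k=1}^{j-1} \mathbb{1}_4 \Bigr) \otimes \bigl( \sigma^a \otimes \sigma^b \bigr) \otimes \Bigl( \bigotimes_{k=j+1}^{N} \mathbb{1}_4 \Bigr) \tag{16} \]

where $a, b \in \{0, x, z\}$, $\sigma^x$ and $\sigma^z$ are the real Pauli matrices, and
$\sigma^0 = \mathbb{1}_2$, we find

\[ \mathcal{M} = \sum_{j=1}^{N} \Bigl[ \mu_1 \sigma_j^{(x,0)} + \mu_2 \sigma_j^{(0,x)} + \mu_3 \sigma_j^{(x,x)} - (\mu_1 + \mu_2 + \mu_3)\, \mathbb{1} \Bigr] \tag{17} \]

for the mutation matrix. The reproduction matrix $\mathcal{R}$ is, for a general fitness
landscape, an element of the algebra generated by $\sigma_j^{(z,0)}$ and
$\sigma_j^{(0,z)}$, $1 \le j \le N$,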

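\[ \mathcal{R} = \sum_{k,l=0}^{N} \; \sum_{[j_1 \ldots j_k]} \; \sum_{[i_1 \ldots i_l]} \varepsilon_{j_1 \ldots j_k}^{\, i_1 \ldots i_l} \prod_{m=1}^{k} \sigma_{j_m}^{(z,0)} \prod_{n=1}^{l} \sigma_{i_n}^{(0,z)} \tag{18} \]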

where $[j_1 \ldots j_k]$ is an ordered $k$-tuple in $\{1, \ldots, N\}$. Now, from a
physical point of view, $\mathcal{H} = \mathcal{M} + \mathcal{R}$ is (up to a global minus
sign) the Hamiltonian of two coupled Ising quantum chains in a tunable transverse
magnetic field (the mutation) and general spin-interactions within the chains.
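
The reduction of the nonlinear evolution equation (2) to the linear problem (15) can be checked numerically for a single site. The following sketch is an illustration under stated assumptions, not the authors' code: the rates, the fitness values, and the function names are chosen arbitrarily. It integrates (2) with a Runge-Kutta scheme and compares the result with the normalized solution of the linear equation.

```python
import numpy as np


def kimura_rate_matrix(mu1, mu2, mu3):
    """Single-site rate matrix in the base order A, C, G, T (columns sum to zero)."""
    off = np.array([[0, mu1, mu2, mu3],
                    [mu1, 0, mu3, mu2],
                    [mu2, mu3, 0, mu1],
                    [mu3, mu2, mu1, 0]], dtype=float)
    return off - np.diag(off.sum(axis=0))


def paramuse_rhs(p, m, r):
    """Right-hand side of the nonlinear evolution equation (2)."""
    return (r - r @ p) * p + m @ p


def integrate_rk4(p0, m, r, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of Eq. (2)."""
    p = p0.copy()
    h = t_end / steps
    for _ in range(steps):
        k1 = paramuse_rhs(p, m, r)
        k2 = paramuse_rhs(p + 0.5 * h * k1, m, r)
        k3 = paramuse_rhs(p + 0.5 * h * k2, m, r)
        k4 = paramuse_rhs(p + h * k3, m, r)
        p = p + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p


def linear_solution(p0, m, r, t):
    """Normalized solution of z' = (M + R) z, cf. Eqs. (14) and (15)."""
    h = m + np.diag(r)                          # symmetric, so eigh applies
    evals, evecs = np.linalg.eigh(h)
    z = evecs @ (np.exp(t * evals) * (evecs.T @ p0))
    return z / z.sum()


m = kimura_rate_matrix(0.1, 0.4, 0.05)          # illustrative rates
r = np.array([1.0, 0.3, 0.6, 0.0])              # illustrative fitness values
p0 = np.full(4, 0.25)                           # equidistribution
p_nonlinear = integrate_rk4(p0, m, r, t_end=3.0, steps=3000)
p_linear = linear_solution(p0, m, r, t=3.0)
```

Both routes give the same genotype composition up to integration error, which is the content of the transformation to absolute frequencies.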

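Translated to our quantum chain model, the fitness function of the permutation
invariant landscape defined in (9) results in a (longitudinal) magnetic field and a
mean field spin-interaction. We find $\mathcal{R} = \mathcal{R}_\alpha + \mathcal{R}_\gamma$, where

\[ \mathcal{R}_\alpha = \sum_{j=1}^{N} \Bigl[ \alpha_1 \sigma_j^{(z,0)} + \alpha_2 \sigma_j^{(0,z)} + \alpha_3 \sigma_j^{(z,z)} \Bigr] \tag{19} \]

and

\[ \mathcal{R}_\gamma = \frac{1}{2N} \sum_{j,k=1}^{N} \Bigl[ \gamma_1 \sigma_j^{(z,0)} \sigma_k^{(z,0)} + \gamma_2 \sigma_j^{(0,z)} \sigma_k^{(0,z)} + \gamma_3 \sigma_j^{(z,z)} \sigma_k^{(z,z)} \Bigr] . \tag{20} \]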

Let us stress that, in contrast to most physical applications, the mean field model 
is a much more natural approach in the biological context where interactions are 
typically long-range. So, it is a legitimate model here, not an inevitable approxi- 
mation. 
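
For small $N$, the operators (16), (17), (19) and (20) can be built explicitly with Kronecker products. The sketch below is a hedged illustration: the helper names and parameter values are assumptions, and the single-site basis is ordered as $(++, +-, -+, --)$, which is simply the ordering that `numpy.kron` produces and not necessarily the one used in the text.

```python
import numpy as np
from functools import reduce

PAULI = {
    "0": np.eye(2),
    "x": np.array([[0.0, 1.0], [1.0, 0.0]]),
    "z": np.array([[1.0, 0.0], [0.0, -1.0]]),
}


def site_operator(a, b, j, n):
    """sigma_j^{(a,b)}: sigma^a (x) sigma^b on site j (1-based), identity elsewhere."""
    factors = [np.eye(4)] * n
    factors[j - 1] = np.kron(PAULI[a], PAULI[b])
    return reduce(np.kron, factors)


def total_operator(a, b, n):
    """Sum of sigma_j^{(a,b)} over all sites j."""
    return sum(site_operator(a, b, j, n) for j in range(1, n + 1))


def mutation_matrix(mu, n):
    """Mutation matrix of Eq. (17); mu = (mu1, mu2, mu3)."""
    mu1, mu2, mu3 = mu
    return (mu1 * total_operator("x", "0", n)
            + mu2 * total_operator("0", "x", n)
            + mu3 * total_operator("x", "x", n)
            - n * (mu1 + mu2 + mu3) * np.eye(4 ** n))


def reproduction_matrix(alpha, gamma, n):
    """R = R_alpha + R_gamma of Eqs. (19) and (20)."""
    totals = [total_operator(a, b, n) for a, b in (("z", "0"), ("0", "z"), ("z", "z"))]
    r_alpha = sum(al * t for al, t in zip(alpha, totals))
    r_gamma = sum(g * (t @ t) for g, t in zip(gamma, totals)) / (2 * n)
    return r_alpha + r_gamma


n = 2
M = mutation_matrix((0.1, 0.4, 0.05), n)                      # illustrative rates
R = reproduction_matrix((1.0, 1.0, 1.0), (0.5, 0.5, 0.5), n)  # illustrative landscape
H = M + R
```

The matrix `M` has vanishing column sums, `R` is diagonal with its largest entry at the wildtype configuration (index 0), and `H` is the symmetric matrix whose largest eigenvalue determines the equilibrium.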

3.3 Biological and physical observables 

In this subsection, we relate the quantities of biological interest, mean and variance 
of the surplus components and the fitness, to the physical observables. In what 
follows, we assume the occurring limits to exist.

Genotype composition According to (15), the Hamiltonian of the quantum
chain determines the time evolution of our population of genotypes in an
environment that does not constrain the population size. For any genotype-independent
regulation of the population size, the relative genotype frequencies are found by
statistical normalization. We therefore define the vector of the genotype composition
$|p(t)\rangle$ and the equilibrium composition $|0\rangle$ as

\[ |p(t)\rangle := \frac{\exp(t\mathcal{H})\, |p_0\rangle}{\langle\Omega| \exp(t\mathcal{H}) |p_0\rangle} ; \qquad |0\rangle := \lim_{t \to \infty} |p(t)\rangle \tag{21} \]

where $|p_0\rangle$ is the initial composition and $4^{-N} |\Omega\rangle$ is the
equidistribution of genotypes. Note that the equilibrium composition of the genotype
population just corresponds to the ground state of the quantum chain on the physical
side (with a different 'biological' normalization $\langle\Omega|0\rangle = 1$).

Fitness The density of the mean fitness (or mean fitness per site) of the population
is given by the expression

\[ w(t) := N^{-1} \bar{r}(t) = N^{-1} \langle\Omega| \mathcal{R} |p(t)\rangle . \tag{22} \]

Since

\[ w := \lim_{t \to \infty} w(t) = N^{-1} \langle\Omega| \mathcal{H} |0\rangle = \frac{\lambda_{\max}}{N} \tag{23} \]

the equilibrium mean fitness (per site) is just given by the (unique) largest eigenvalue
$\lambda_{\max}$ of $\mathcal{H}$, corresponding to $|0\rangle$. For an unconstrained
population, $w$ also determines the growth rate in the long-time limit. In the physical
picture, $(-w)$ is obviously just the ground state energy (per spin).

Using $\langle\Omega| \mathcal{M} = 0$, we derive for the time evolution of the mean fitness

\[ \dot{w}(t) = V_r(t) + N^{-1} \langle\Omega| [\mathcal{R}, \mathcal{M}] |p(t)\rangle \tag{24} \]

where $V_r(t)$ is the variance of fitness (per site),

\[ V_r(t) = \frac{1}{N} \Bigl( \langle\Omega| \mathcal{R}^2 |p(t)\rangle - \langle\Omega| \mathcal{R} |p(t)\rangle^2 \Bigr) . \tag{25} \]

In the absence of mutation, (24) is of course just a special case of Fisher's
"Fundamental Theorem of Natural Selection" which states that the rate of increase in
fitness is equal to the genetic variance in fitness. For the mutation-selection models
considered here, the relation has the following intuitive interpretation: The change
in mean fitness is driven by two independent forces. The first one stems from the
change of genotype frequencies due to selection and is proportional to the variance
of fitness values present in the population. Since variances are positive, it always
tends to increase fitness. The second term on the right hand side of (24) typically
decreases fitness. It measures the population mean of the change in fitness at time
$t$ due to the action of mutation. In mutation-selection equilibrium, both terms
balance, and the entire residual variance is due to mutation.
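
The balance between the selection term and the mutation term can be verified for an arbitrary composition. The following minimal sketch treats a single site ($N = 1$) with illustrative rates and fitness values; all names are hypothetical. It evaluates the three quantities entering the relation between $\dot{w}$, the fitness variance and the commutator term.

```python
import numpy as np


def kimura_rate_matrix(mu1, mu2, mu3):
    """Single-site rate matrix in the base order A, C, G, T (columns sum to zero)."""
    off = np.array([[0, mu1, mu2, mu3],
                    [mu1, 0, mu3, mu2],
                    [mu2, mu3, 0, mu1],
                    [mu3, mu2, mu1, 0]], dtype=float)
    return off - np.diag(off.sum(axis=0))


def fitness_balance(p, m, r):
    """Return (dw/dt, V_r, mutation term) for one site (N = 1)."""
    omega = np.ones_like(p)                     # <Omega| sums over all genotypes
    big_r = np.diag(r)
    p_dot = (r - r @ p) * p + m @ p             # evolution equation (2)
    w_dot = omega @ big_r @ p_dot
    variance = omega @ big_r @ big_r @ p - (omega @ big_r @ p) ** 2
    mutation_term = omega @ (big_r @ m - m @ big_r) @ p
    return w_dot, variance, mutation_term


rng = np.random.default_rng(0)
p = rng.random(4)
p /= p.sum()                                    # random normalized composition
r = np.array([1.0, 0.3, 0.6, 0.0])              # illustrative fitness values
m = kimura_rate_matrix(0.1, 0.4, 0.05)          # illustrative rates
w_dot, variance, mutation_term = fitness_balance(p, m, r)
w_dot_sel, variance_sel, _ = fitness_balance(p, np.zeros((4, 4)), r)
```

With the mutation matrix set to zero, the rate of change of the mean fitness coincides with the variance, which is the Fisher-type statement quoted above.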



Surplus Another quantity that characterizes the genetic order of the population,
as it may be measured from sequence data, is the mean surplus. We define, following
and generalizing ,

\[ u_i(t) := \sum_{\sigma} s_i(\sigma)\, p_\sigma(t) ; \qquad u_i := \lim_{t \to \infty} u_i(t) . \tag{26} \]

In particular,

\[ d(t) := \frac{1}{4} \bigl( 3 - (u_1(t) + u_2(t) + u_3(t)) \bigr) \tag{27} \]

measures the mean number of mutations per site relative to the wildtype while

(28) 

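denotes the mean number of transversions alone. As a biological order parameter,
the mean surplus plays a role similar to that of the physical magnetization. However, as
already noted in both quantities are quite distinct and in many cases not even
easily related. In the language of the quantum chain, the equilibrium mean surplus
may be derived as

\[ u_1 = \frac{\langle\Omega| \sum_j \sigma_j^{(z,0)} |0\rangle}{N \langle\Omega|0\rangle} , \qquad u_2 = \frac{\langle\Omega| \sum_j \sigma_j^{(0,z)} |0\rangle}{N \langle\Omega|0\rangle} , \qquad u_3 = \frac{\langle\Omega| \sum_j \sigma_j^{(z,z)} |0\rangle}{N \langle\Omega|0\rangle} , \tag{29} \]

whereas the three-component magnetization is defined as the ground state
expectation value

\[ m_1 = \frac{\langle 0| \sum_j \sigma_j^{(z,0)} |0\rangle}{N \langle 0|0\rangle} , \qquad m_2 = \frac{\langle 0| \sum_j \sigma_j^{(0,z)} |0\rangle}{N \langle 0|0\rangle} , \qquad m_3 = \frac{\langle 0| \sum_j \sigma_j^{(z,z)} |0\rangle}{N \langle 0|0\rangle} . \tag{30} \]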

As we will show below, magnetization and surplus can show rather different be- 
haviour especially near phase transitions. The biological and physical phase dia- 
grams, however, coincide if phase transitions (or error thresholds) are defined as 
nonanalyticity points of the ground state energy (or mean fitness) w in the thermo- 
dynamic limit (cf. the discussion in Section 5). 



4 Results 

4.1 Fujiyama model 

As in the two-letter case, the quantum chain model decomposes into
non-interacting one-site Hamiltonians for the additive landscape. The mean fitness
and its variance are linear functions in the surplus components. In particular, we
obtain from (24)

\[ V_r(t) = \dot{w}(t) + 2 \bigl( (\mu_1 + \mu_3)\, \alpha_1 u_1(t) + (\mu_2 + \mu_3)\, \alpha_2 u_2(t) + (\mu_1 + \mu_2)\, \alpha_3 u_3(t) \bigr) . \tag{31} \]

For Jukes-Cantor mutation, $\mu_1 = \mu_2 = \mu_3 = \mu$, this reduces to

\[ V_r(t) = \Bigl( 4\mu + \frac{d}{dt} \Bigr) w(t) \tag{32} \]

and $V_r$ is proportional to the mean fitness in the mutation-selection equilibrium.
Exact results are easily found from the solution of the four-dimensional eigenvalue
problem of the one-site Hamiltonian. We only give the expression for the mean
fitness in the symmetric case, $\alpha_1 = \alpha_2 = \alpha_3 = \alpha$ with Jukes-Cantor
mutation scheme ($\mu_1 = \mu_2 = \mu_3 = \mu$):

\[ w(t) = \frac{\exp[2t(\alpha + \mu)] \cosh[2tQ] \bigl( \alpha - 2\mu + 2Q \tanh[2tQ] \bigr) - \alpha - 4\mu}{1 + \exp[2t(\alpha + \mu)] \cosh[2tQ]} \tag{33} \]

where

\[ Q = \sqrt{\mu^2 + \alpha^2 - \alpha\mu} \tag{34} \]

and the equidistribution of genotypes is chosen as starting configuration.
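
The long-time limit of the mean fitness in the symmetric Fujiyama case is the largest eigenvalue of the one-site Hamiltonian, which evaluates to $\alpha - 2\mu + 2Q$ with $Q$ as in (34). The sketch below checks this numerically; it assumes a one-site basis with the wildtype first and the three mutants after it, and the function names are illustrative.

```python
import numpy as np


def fujiyama_site_hamiltonian(alpha, mu):
    """One-site Hamiltonian for alpha_i = alpha, gamma_i = 0, Jukes-Cantor rate mu."""
    # Fitness 3*alpha for the wildtype label ++ and -alpha for each mutant label.
    reproduction = alpha * np.diag([3.0, -1.0, -1.0, -1.0])
    # Rate mu between any two labels, total outflow 3*mu on the diagonal.
    mutation = mu * (np.ones((4, 4)) - 4.0 * np.eye(4))
    return mutation + reproduction


def equilibrium_fitness(alpha, mu):
    """Largest eigenvalue of the symmetric one-site Hamiltonian."""
    return np.linalg.eigvalsh(fujiyama_site_hamiltonian(alpha, mu))[-1]


def closed_form(alpha, mu):
    """alpha - 2 mu + 2 Q with Q = sqrt(mu^2 + alpha^2 - alpha mu), cf. Eq. (34)."""
    q = np.sqrt(mu ** 2 + alpha ** 2 - alpha * mu)
    return alpha - 2.0 * mu + 2.0 * q
```

For instance, `equilibrium_fitness(1.0, 1.0)` and `closed_form(1.0, 1.0)` agree. Since the equilibrium fitness varies smoothly with $\mu$, the sketch is in line with the absence of an error threshold for the additive landscape.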

Means and variances of the fitness and the surplus in mutation-selection balance 
are shown in Fig. ^ below. A plot of the time evolution of fitness is given in 
Fig. Ij. There is clearly no phase transition (resp. no error threshold behaviour) for
the additive Fujiyama landscape, as expected in view of the complete absence of
interactions (resp. epistasis).

4.2 Quadratic fitness model: Equilibrium results



In contrast to the additive case, no simple relation between surplus and fitness is
known in the case of the quadratic landscape as long as $t$ or $N$ are kept finite.
However, due to the permutation invariance of the Hamiltonian, the individual
fitness-surplus relation (9) is recovered in the thermodynamic limit for the
corresponding mean values of the equilibrium population. We obtain in analogy to

\[ w = \lim_{t \to \infty} w(t) = \sum_{i=1}^{3} \Bigl( \alpha_i u_i + \frac{\gamma_i}{2}\, u_i^2 \Bigr) \tag{35} \]

and, from (24), for the equilibrium variance of fitness per site

\[ V_r = \lim_{t \to \infty} V_r(t) = 2(\mu_1 + \mu_3)(\alpha_1 u_1 + \gamma_1 u_1^2) + 2(\mu_2 + \mu_3)(\alpha_2 u_2 + \gamma_2 u_2^2) + 2(\mu_1 + \mu_2)(\alpha_3 u_3 + \gamma_3 u_3^2) . \tag{36} \]

The key to the solution in the thermodynamic limit is now the minimum principle of
the physical free energy which translates to a maximum principle for the equilibrium
mean fitness. Maximizing

\[ \langle x| \mathcal{M} + \mathcal{R} |x\rangle - w \bigl( \langle x|x\rangle - 1 \bigr) \tag{37} \]

with respect to $w$ and $x$, we obtain, taking permutation symmetry of $x$ into account,
the following variational expression for $w$:
the following variational expression for w: 



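\[ \begin{aligned} w(\alpha, \mu, \gamma) = \sup_{m_1, m_2, m_3} \Bigl[ \; & \alpha_1 m_1 + \alpha_2 m_2 + \alpha_3 m_3 + \frac{\gamma_1}{2} m_1^2 + \frac{\gamma_2}{2} m_2^2 + \frac{\gamma_3}{2} m_3^2 \\ & + \frac{\mu_1}{2} \Bigl( \sqrt{(1 + m_2)^2 - (m_1 + m_3)^2} + \sqrt{(1 - m_2)^2 - (m_1 - m_3)^2} - 2 \Bigr) \\ & + \frac{\mu_2}{2} \Bigl( \sqrt{(1 + m_1)^2 - (m_2 + m_3)^2} + \sqrt{(1 - m_1)^2 - (m_2 - m_3)^2} - 2 \Bigr) \\ & + \frac{\mu_3}{2} \Bigl( \sqrt{(1 + m_3)^2 - (m_1 + m_2)^2} + \sqrt{(1 - m_3)^2 - (m_1 - m_2)^2} - 2 \Bigr) \Bigr] \end{aligned} \tag{38} \]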



where $m_i \in [-1, 1]$ are the components of the physical magnetization. Let us stress
that, from the biological point of view, the translation to the physical framework
seems a necessary technical step since we do not know of any variational principle
for the biological model which works directly in $L^1$. We now take a closer look at
two special cases.
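
To make the variational expression concrete, the supremum can be approximated by a plain grid search over the admissible magnetizations. The following is a sketch under stated assumptions rather than the method used in the text: each $\mu_i/2$ prefactor is paired with the square-root term that follows from the one-site probabilities of the four labels, the admissible region is taken to be the tetrahedron where these probabilities are non-negative, and all names are hypothetical.

```python
import numpy as np


def variational_functional(m1, m2, m3, alpha, gamma, mu):
    """Bracket of Eq. (38); returns -inf outside the admissible tetrahedron."""
    # One-site probabilities of the labels ++, -+, +-, -- implied by (m1, m2, m3).
    probs = np.stack([1 + m1 + m2 + m3, 1 - m1 + m2 - m3,
                      1 + m1 - m2 - m3, 1 - m1 - m2 + m3]) / 4
    inside = np.all(probs >= -1e-12, axis=0)

    def transverse(a, b, c):
        # (sqrt((1+a)^2 - (b+c)^2) + sqrt((1-a)^2 - (b-c)^2) - 2) / 2
        r1 = np.clip((1 + a) ** 2 - (b + c) ** 2, 0.0, None)
        r2 = np.clip((1 - a) ** 2 - (b - c) ** 2, 0.0, None)
        return 0.5 * (np.sqrt(r1) + np.sqrt(r2) - 2.0)

    value = (alpha[0] * m1 + alpha[1] * m2 + alpha[2] * m3
             + 0.5 * (gamma[0] * m1 ** 2 + gamma[1] * m2 ** 2 + gamma[2] * m3 ** 2)
             + mu[0] * transverse(m2, m1, m3)
             + mu[1] * transverse(m1, m2, m3)
             + mu[2] * transverse(m3, m1, m2))
    return np.where(inside, value, -np.inf)


def mean_fitness_sup(alpha, gamma, mu, points=61):
    """Grid approximation of the supremum and the maximizing magnetization."""
    axis = np.linspace(-1.0, 1.0, points)
    m1, m2, m3 = np.meshgrid(axis, axis, axis, indexing="ij")
    values = variational_functional(m1, m2, m3, alpha, gamma, mu)
    best = np.unravel_index(np.argmax(values), values.shape)
    return values[best], (axis[best[0]], axis[best[1]], axis[best[2]])


# Additive landscape (gamma = 0) with alpha = mu = 1: the exact one-site result
# is alpha - 2 mu + 2 Q = 1 with Q = sqrt(mu^2 + alpha^2 - alpha mu) = 1.
w_grid, m_grid = mean_fitness_sup((1.0, 1.0, 1.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

For the non-interacting case the variational value coincides with the exact one-site eigenvalue, which serves as a consistency check before turning to $\gamma_i \neq 0$. A grid search is used only for transparency; any standard optimizer could take its place.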



Symmetric fitness model For the symmetric wildtype-mutant model with $\alpha_i = \alpha$,
$\gamma_i = \gamma$ and Jukes-Cantor mutation rate $\mu$, all components of the order
parameters are equal, $m_i = m$ and $u_i = u$, respectively. Here, the variational
expression for $w$ leads to the following self-consistency condition for $m$:

\[ m = \frac{1}{3} \Bigl( 1 + \frac{2(\alpha + \gamma m) - \mu}{\sqrt{(\alpha + \gamma m)^2 - \mu (\alpha + \gamma m) + \mu^2}} \Bigr) . \tag{39} \]

This is a quartic equation in $m$ and can be solved using the standard formulas.
However, since the explicit solution is rather lengthy, we do not include it here,
but give a qualitative discussion instead.

Obviously, the relation has a unique real solution for any $\alpha$ and $\mu$ whenever
$\gamma$ is negative. As in the case of the two-state model, we thus obtain no phase
transition for positive epistasis. In the following, we therefore concentrate our
discussion on positive $\gamma$ (or negative epistasis). Note that, for calculations in
the thermodynamic limit, the fitness function $f$ (9), and hence the reproduction
matrix $\mathcal{R}_\gamma$ (20), can always be used instead of the truncated form
$\tilde{f}$ (10), since the frequencies of genotypes with negative surplus vanish. For
$\alpha_i = 0$, this is due to spontaneous breaking of the extra $C_2 \times C_2$ symmetry
of $\mathcal{H} = \mathcal{M} + \mathcal{R}_\gamma$.

In contrast to the two-state model, where a phase transition in the thermodynamic
limit is only found for zero external field, it turns out that the present
model has phase transitions for a whole range of the linear fitness parameter $\alpha$
when epistasis is negative: For $\tilde{\alpha} := \alpha/\gamma$ in the interval

\[ 0 \le \tilde{\alpha} < \frac{2\sqrt{3} - 3}{9} \approx 0.0515668 \tag{40} \]

we find a first order phase transition of the system at

\[ \tilde{\mu} := \frac{\mu}{\gamma} = \tilde{\mu}_c = \frac{2}{3} + 2\tilde{\alpha} \tag{41} \]

with a finite jump in the magnetization from $m_+$ to $m_-$ where

\[ m_\pm = \frac{1}{3} \Bigl( 1 \pm \sqrt{1 - 18\tilde{\alpha} - 27\tilde{\alpha}^2} \Bigr) . \tag{42} \]




From $m$ we derive the mean fitness $w$ using (38), from $w$ we obtain the surplus
$u$ via (35) and, finally, the variance of the fitness $V_r = 12\mu(\alpha u + \gamma u^2)$.
Looking at the surplus $u$, we also find a phase transition at
$\tilde{\mu} = \tilde{\mu}_c$. Like $m$, it vanishes in the disordered phase for
$\alpha = 0$. Note however that, since $w$ is continuous, due to the relation (35), also
the surplus is continuous at a phase transition. In [Q it has
been shown that these differences of the biological and physical order parameters
arise with the change from classical to quantum mechanical probabilities (resp.
the change from $L^1$ to $L^2$) in translating the biological model into the physical
one. We remark that a different, discontinuous behaviour of the biological order
parameter at a (physical) first order transition has been observed for the sharply
peaked landscape in Eigen's quasispecies model [10]. Mean fitness and its variance,
magnetization, and surplus for different values of $\alpha$ are shown below in Fig. 3.



Transition-transversion model. In our second example, we wish to distinguish
mutations between like and unlike nucleotides. In a first step, we retain the symmetric
fitness landscape $\gamma_1 = \gamma_2 = \gamma_3 = \gamma$ (for simplicity with vanishing linear part
$\alpha = 0$), but let the relative frequencies of transitions and transversions differ by
assuming the Kimura 2 parameter mutation scheme, $\mu_1 = \mu_3 = \mu \neq \mu_2$.

In the extended parameter space of the reduced mutation rates $\tilde\mu = \mu/\gamma$,
$\tilde\mu_2 = \mu_2/\gamma$, we now obtain a phase diagram with three distinct phases (see Fig. 4).

• For $\tilde\mu$ and $\tilde\mu_2$ sufficiently small, all three surplus components are positive,
indicating genetic order with respect to the entire 4-letter alphabet of the
nucleotides: ACGT phase.

• If we increase the mutation rate $\tilde\mu_2$ for low $\tilde\mu$, the system crosses over to a
phase which no longer distinguishes between the different kinds of purines
(A,G) and pyrimidines (C,T), but is still ordered with respect to transversions.
This is the limiting case described by the two-state model. We call this the
PP phase.

• For higher mutation rates $\tilde\mu$, $\tilde\mu_2$, we finally enter a completely disordered phase
with vanishing fitness and surplus.

Figure 3: Mean fitness and its variance, surplus and magnetization in the symmetric
fitness model for various linear parts of the fitness function in the infinite sites limit.



In a second step, we now also let the mutation effects of transitions and transversions
differ and assume a fitness landscape with $\gamma_2 = \gamma_3 = \gamma$, but $\gamma_1 \neq \gamma$ in general. The
changes in the phase diagram for increasing $\tilde\gamma_1 = \gamma_1/\gamma$ are shown in Fig. 5. The
phase transitions between the three phases may be first or second order. In general,
we obtain the following phase space structure:

• Phase transitions between the disordered and PP phase are second order and
located on the line $\tilde\mu = \tilde\gamma_1/2$. This phase transition corresponds to the one
also seen in the two-state model j2) .

• The phase transition line between the ACGT and PP phases in general changes
from first to second order with increasing mutation rate $\tilde\mu_2$ (see Figs. 4 and 5).
For the second order transitions we derive, on expanding (38) to lowest order
in $m_2 = m_3$,

$$\mu = \frac{\gamma_1}{\gamma_1 + 2\gamma}\sqrt{(\gamma_1 + \mu_2)(2\gamma - \mu_2)}\,. \qquad (43)$$

Numerically, we find that the first order transitions are located on a straight
line up to $\tilde\mu = \tilde\gamma_1/2$, where the PP phase changes into the disordered phase.
The $\tilde\mu_2$-interval of first-order transitions decreases for increasing $\tilde\gamma_1$. For $\tilde\gamma_1 \gtrsim 8.45$,
all phase transitions between the ACGT and PP phases are second order.

• Finally, for $\tilde\gamma_1 < 4$, there are direct first order phase transitions between the
ACGT phase and the disordered phase (for $\tilde\mu_2$ sufficiently small). For higher
values of $\tilde\gamma_1$, these two phases are separated by the PP phase.
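
The location of the second order ACGT-PP line relative to the PP-disordered boundary can be checked numerically. The following is a minimal sketch, assuming that Eq. (43) is read as $\mu = \frac{\gamma_1}{\gamma_1 + 2\gamma}\sqrt{(\gamma_1+\mu_2)(2\gamma-\mu_2)}$; the function names and the sample parameter values are hypothetical and chosen for illustration only. Under this reading, the line never rises above $\mu = \gamma_1/2$ and can touch it only at $\mu_2 = \gamma - \gamma_1/2$.

```python
import math

def second_order_line(mu2, gamma1, gamma=1.0):
    """Mutation rate mu on the second order ACGT-PP line, ASSUMING Eq. (43) reads
    mu = gamma1/(gamma1 + 2*gamma) * sqrt((gamma1 + mu2) * (2*gamma - mu2))."""
    radicand = max(0.0, (gamma1 + mu2) * (2.0 * gamma - mu2))  # guard against rounding
    return gamma1 / (gamma1 + 2.0 * gamma) * math.sqrt(radicand)

def max_gap_to_pp_boundary(gamma1, gamma=1.0, steps=2000):
    """Largest value of mu(mu2) - gamma1/2 on a grid 0 <= mu2 <= 2*gamma."""
    grid = (2.0 * gamma * k / steps for k in range(steps + 1))
    return max(second_order_line(m2, gamma1, gamma) - gamma1 / 2.0 for m2 in grid)

if __name__ == "__main__":
    for g1 in (1.0, 1.5, 10.0):
        print(g1, max_gap_to_pp_boundary(g1))
```

For $\gamma_1 = \gamma$ the gap closes at $\mu_2 = \gamma/2$, whereas for $\gamma_1 = 10\gamma$ the touching point would lie at negative $\mu_2$, so the curve stays strictly below the boundary for all admissible rates.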

Figure 4: Phase diagram of the transition-transversion model with symmetric
fitness landscape and Kimura 2 parameter mutation scheme. Solid and dotted lines
correspond to first and second order phase transitions, respectively. The dashed
line indicates the Jukes-Cantor mutation scheme.

[Panel: $\tilde\gamma_1 = 10$]

Figure 5: Phase diagrams for anisotropic fitness landscapes $\gamma_1 > \gamma_2 = \gamma_3 = \gamma$ and
Kimura 2 parameter mutation scheme. Solid and dotted lines correspond to first
and second order phase transitions, respectively.

As for the symmetric fitness function discussed above, there are no compact 
analytic expressions for the fitness or the surplus in the ACGT phase. In the 
PP phase, however, the following values for the mean fitness and the non-zero 
components of the mean surplus and the magnetization are found: 

The variance in fitness per site, finally, is proportional to the mean fitness in the PP
phase: $V_R = 8\mu w$. Note that all these expressions are independent of the transition
rate $\mu_2$ and directly comparable to the results of the two-state model ^ by
identifying $\{++, +-\}$ with '+' and $\{-+, --\}$ with '-'.

4.3 Quadratic fitness model: Finite sequence length 

For the Fujiyama model with independent sites, all the quantities calculated here,
means and variances per site in infinite populations, are independent of the assumed
length $N$ of the sequences. This is no longer the case for models including epistasis.
In this subsection, we therefore present a quick numerical investigation of the symmetric
fitness model for finite system sizes and compare the results with those in
the thermodynamic limit. Since the frequencies of genotypes with negative values
of the surplus no longer vanish for finite sequences, we use the truncated fitness
function (10), with $\gamma_i = \gamma > 0$ and $\alpha_i = 0$, for our calculations.

All results are obtained by direct numerical solution of the eigenvalue problem
in the $[(N+1)(N+2)(N+3)/6]$-dimensional vector space of permutation invariant
population vectors. Numerically precise calculations have been performed up to
$N = 60$ (39711-dim.); the results are shown in Fig. 6. It is seen that the mean
surplus and the mean and the variance of the fitness rapidly approach the limiting
curves and behave qualitatively differently from the Fujiyama model even for very
small system sizes. We also show the finite-size behaviour of the variance of the
surplus $V_S$. Since this quantity vanishes as $1/N$, it is not obtainable from the
leading order terms in the thermodynamic limit. In our finite size calculations,
we rescale $V_S$ with the sequence length to obtain comparable results. Whereas $V_S$
is monotonically increasing for the additive model (where $N V_S = 1 - u^2$), it runs
through a maximum for quadratic fitness. Note that this maximum, in contrast to
the variance of fitness, is located directly at the error threshold. The behaviour is
qualitatively similar to the two-state model pi|.
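
The dimension of the permutation invariant subspace quoted above can be cross-checked by brute force. The sketch below counts the occupation vectors $(n_1, n_2, n_3, n_4)$ with $n_1+n_2+n_3+n_4 = N$ (one per class of four-letter sequences that differ only by a permutation of sites) and compares the count with the closed formula $(N+1)(N+2)(N+3)/6$. The function names are hypothetical.

```python
def count_types(N):
    """Count occupation vectors (n1, n2, n3, n4) with n1 + n2 + n3 + n4 = N, i.e. the
    number of permutation invariant classes of four-letter sequences of length N."""
    count = 0
    for n1 in range(N + 1):
        for n2 in range(N + 1 - n1):
            for n3 in range(N + 1 - n1 - n2):
                count += 1  # n4 = N - n1 - n2 - n3 is then fixed
    return count

def dimension_formula(N):
    """Closed form (N+1)(N+2)(N+3)/6 for the same count."""
    return (N + 1) * (N + 2) * (N + 3) // 6

if __name__ == "__main__":
    print(count_types(60), dimension_formula(60))
```

For $N = 60$ both routines give the 39711 dimensions stated in the text, compared with $4^{60}$ configurations in the full sequence space.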

Figure 6: Equilibrium behaviour of fitness and surplus of the symmetric fitness
model with finite sequence length. Results for the Fujiyama model with scaling
$\alpha = \gamma/2$ are also shown.

Since there has been some discussion recently on the correct scaling of fitness 
values and mutation rates with the length of the sequence (cf |l^, |^), let us fi- 
nally remark that the finite size results in this and the next section show that our 
choice, keeping fitness and mutation rate per site fixed, is adequate for all quantities 
considered here. 

4.4 Quadratic fitness model: Time evolution



Originally, the error threshold has been defined as an equilibrium phenomenon (cf 
H, Q): For special forms of the fitness landscape, there is a finite critical value $\mu_c$ of
the mutation rate beyond which genetic order is no longer maintained by selection. 
For the four-state model with quadratic fitness, this situation has been discussed 
above. However, for a suitable fitness function, the threshold is not necessarily 
connected with high mutation rates. In this subsection, we consider the relaxation 
of a non-equilibrium population to mutation-selection balance. It turns out that, 
depending on the starting configuration, an even stronger threshold effect may be 
observed in the time evolution of the fitness and the surplus for all mutation rates 
below the critical equilibrium value. 



Zero-mutation limit of the transition-transversion model. The essence of
the threshold phenomenon in the time evolution is already contained in the selection
dynamics alone. In a first step, we therefore disregard mutation altogether by
working in the zero-mutation limit. Obviously, we then deal with a classical mean-field
model on the physical side. As our starting configuration, we choose the
completely unstructured population with an equidistribution of genotypes $|p_0\rangle = 4^{-N}|\Omega\rangle$.
In this particular situation, some progress is possible also analytically.
Noting that

$$\frac{\langle\Omega|C\exp(t\mathcal{R})|\Omega\rangle}{\langle\Omega|\exp(t\mathcal{R})|\Omega\rangle} = \frac{\mathrm{tr}(C\exp(t\mathcal{R}))}{\mathrm{tr}(\exp(t\mathcal{R}))} \qquad (45)$$

for any element $C$ of the algebra generated by {o'l^''^\ crf^''''^}, the biological and
physical pictures coincide in this case. Using the fitness function of the transition-transversion
model with $\gamma_2 = \gamma_3 = \gamma > 0$, we obtain the following implicit equations
for the time evolution of the surplus components:

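$$u = \frac{\sinh(2\gamma t u)}{\cosh(2\gamma t u) + \exp[-2\gamma_1 t\,(2u\coth(2\gamma t u) - 1)]} \qquad (46)$$

$$u_1 = \frac{\cosh[\gamma t\,Q(u_1)] - \exp(-2\gamma_1 t u_1)}{\cosh[\gamma t\,Q(u_1)] + \exp(-2\gamma_1 t u_1)} \qquad (47)$$

where

$$Q(u_1) = \sqrt{(1+u_1)^2 - \exp(4\gamma_1 t u_1)\,(1-u_1)^2}\,. \qquad (48)$$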
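
The implicit equations can be cross-checked against a direct mean-field iteration. The sketch below rests on assumptions that are not spelled out at this point of the text: the four site states are labelled ++, +-, -+, --; the three surplus components are the averages of the first sign, the second sign and their product; and the fitness per site is $\frac{1}{2}(\gamma_1 u_1^2 + \gamma u_2^2 + \gamma u_3^2)$ with $u_2 = u_3 = u$. With reduced variables $\tau = \gamma t$ and $\tilde\gamma_1 = \gamma_1/\gamma$, the code iterates the single-site self-consistency conditions from an ordered start and then evaluates the residuals of Eqs. (46) and (47), together with the relation $Q(u_1) = 2u$ that links them. All function names are hypothetical.

```python
import math

def site_averages(h1, h):
    """Single-site averages (u1, u) for fields h1 (first component) and h (second and third)."""
    w_pp = math.exp(h1 + 2.0 * h)   # state ++ : components (+1, +1, +1)
    w_pm = math.exp(h1 - 2.0 * h)   # state +- : components (+1, -1, -1)
    w_m = math.exp(-h1)             # states -+ and -- carry the same weight
    z = w_pp + w_pm + 2.0 * w_m
    return (w_pp + w_pm - 2.0 * w_m) / z, (w_pp - w_pm) / z

def ordered_solution(tau, g1, iterations=500):
    """Iterate the mean-field equations from an ordered start; tau = gamma*t, g1 = gamma1/gamma."""
    u1, u = 0.9, 0.9
    for _ in range(iterations):
        u1, u = site_averages(g1 * tau * u1, tau * u)
    return u1, u

def residuals(tau, g1, u1, u):
    """Residuals of Eqs. (46) and (47), and of Q(u1) = 2u with Q from Eq. (48)."""
    coth = math.cosh(2 * tau * u) / math.sinh(2 * tau * u)
    r46 = u - math.sinh(2 * tau * u) / (
        math.cosh(2 * tau * u) + math.exp(-2 * g1 * tau * (2 * u * coth - 1)))
    q = math.sqrt((1 + u1) ** 2 - math.exp(4 * g1 * tau * u1) * (1 - u1) ** 2)
    e = math.exp(-2 * g1 * tau * u1)
    r47 = u1 - (math.cosh(tau * q) - e) / (math.cosh(tau * q) + e)
    return r46, r47, q - 2 * u

if __name__ == "__main__":
    u1, u = ordered_solution(1.5, 2.0)
    print(u1, u, residuals(1.5, 2.0, u1, u))
```

The iteration only finds a locally stable solution; deciding which solution dominates at a first order transition would additionally require comparing the corresponding growth rates, which this sketch does not attempt.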



The resulting dynamical phase diagram is shown in Fig. 7. As in the equilibrium
situation, there are three phases. Depending on the ratio $\tilde\gamma_1 = \gamma_1/\gamma$, the system
directly crosses to an ordered phase after a sharply defined waiting time $t_c$, or
performs two consecutive transitions, entering the PP phase in the first one.

As in the equilibrium phase diagram, the dynamical transitions may be of first 
or second order. 

• Second order transitions are located at $\tau = \gamma t = 1$ for $\tilde\gamma_1 < 1/4$ and at $\tau = 1/\tilde\gamma_1$
for the transition from the disordered phase to the PP phase. The transition
from the PP phase to the ACGT phase is second order above $\tilde\gamma_1 \approx 1.9009$ and
implicitly given through $2\tau_c = 1 + \exp[2\tilde\gamma_1(\tau_c - 1)]$. A similar second order
transition (with a one-component order parameter) has also been observed in
the two-state model |2^, |2^ .

• In an interval around the symmetry point $\gamma_1 = \gamma$, the system possesses a first
order transition (in the sense that there is a finite jump in the magnetization).
Note that, in contrast to the equilibrium case, the surplus and even the
mean fitness are also discontinuous on this line, giving rise to a rather pronounced
threshold effect in the evolution dynamics (cf. the solid line in Fig. ^ for $\gamma = 1$).

Figure 7: Dynamical phase diagram of the transition-transversion model for vanishing
mutation starting from the equidistribution. (Solid: first order; dashed: second
order transition). Right: Time evolution of the surplus components for $\tilde\gamma_1 = 2$.
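
The implicit condition for the second order PP-ACGT transition is easily solved numerically. The sketch below assumes the reduced form $2\tau_c = 1 + \exp[2\tilde\gamma_1(\tau_c - 1)]$ with $\tau = \gamma t$ and only treats $\tilde\gamma_1 > 1$; the function name and the bracketing interval are choices made here, not taken from the text. Since $\tau = 1$ always solves the equation, the bisection is restricted to the interval between $\tau = 1/2$ and the minimum of the convex function whose zero is sought, which isolates the nontrivial root.

```python
import math

def pp_to_acgt_time(g1, tol=1e-12):
    """Nontrivial root tau_c < 1 of 2*tau = 1 + exp(2*g1*(tau - 1)) for g1 = gamma1/gamma > 1,
    found by bisection (tau = 1 is always a trivial root)."""
    if g1 <= 1.0:
        raise ValueError("this sketch only covers g1 > 1")

    def f(tau):
        return 1.0 + math.exp(2.0 * g1 * (tau - 1.0)) - 2.0 * tau

    lo = 0.5                                  # f(0.5) = exp(-g1) > 0
    hi = 1.0 - math.log(g1) / (2.0 * g1)      # minimum of the convex function f, where f < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    tau_c = pp_to_acgt_time(2.0)
    print(tau_c, (1.0 - tau_c) / tau_c)   # transition time and u1 at the transition
```

For $\tilde\gamma_1 = 2$ the root lies slightly above $\tau = 0.6$, i.e. after the entry into the PP phase at $\tau = 1/\tilde\gamma_1 = 0.5$. Under the mean-field picture assumed here, the surplus component at the transition is $u_1 = (1 - \tau_c)/\tau_c$, which satisfies $u_1 = \tanh(\tilde\gamma_1 \tau_c u_1)$.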

As for the equilibrium values, we also consider the effect of finite sequence lengths
on the time evolution. Again, calculations are performed by direct diagonalization
of the symmetric fitness model ($\gamma = 1$). Fig. 8 shows how the jump discontinuity in
the mean fitness (internal energy) and the delta function singularity in the variance
of the fitness (specific heat) are approached by the finite systems. A threshold
phenomenon is absent in the time evolution of the Fujiyama model, which is also
shown in Fig. 8.
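
In the zero-mutation limit, the finite-size time evolution can be reproduced with a few lines of code, because the population started from the equidistribution is simply reweighted by $\exp(t r)$. The following sketch makes several assumptions: the untruncated quadratic fitness per site $\frac{\gamma}{2}(\tilde\gamma_1 u_1^2 + u_2^2 + u_3^2)$ with vanishing linear part, the labelling ++, +-, -+, -- of the four site states, and the surplus components as the averages of the first sign, the second sign and their product. It sums over the permutation invariant classes with multinomial weights; the function name is hypothetical.

```python
from math import exp, lgamma

def mean_fitness_zero_mutation(N, tau, g1=1.0):
    """Mean fitness per site (in units of gamma) at reduced time tau = gamma*t, starting
    from the equidistribution, without mutation. Also returns the number of classes."""
    terms = []
    for a in range(N + 1):                      # sites in state ++
        for b in range(N + 1 - a):              # sites in state +-
            for c in range(N + 1 - a - b):      # sites in state -+
                d = N - a - b - c               # sites in state --
                u1 = (a + b - c - d) / N
                u2 = (a - b + c - d) / N
                u3 = (a - b - c + d) / N
                r = 0.5 * (g1 * u1 * u1 + u2 * u2 + u3 * u3)   # assumed fitness per site
                log_mult = (lgamma(N + 1) - lgamma(a + 1) - lgamma(b + 1)
                            - lgamma(c + 1) - lgamma(d + 1))  # log multinomial coefficient
                terms.append((log_mult + tau * N * r, r))
    shift = max(lw for lw, _ in terms)          # avoid overflow in exp
    weights = [(exp(lw - shift), r) for lw, r in terms]
    z = sum(w for w, _ in weights)
    return sum(w * r for w, r in weights) / z, len(terms)

if __name__ == "__main__":
    for tau in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(tau, mean_fitness_zero_mutation(20, tau)[0])
```

At $\tau = 0$ the result is $3/(2N)$ for $\tilde\gamma_1 = 1$, the pure finite-size contribution of the unstructured population, and the mean fitness then grows monotonically towards its maximal value $3/2$.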

Figure 8: Time evolution of the equidistribution of genotypes in the zero-mutation
limit of the symmetric fitness model for different sequence lengths.



Finite mutation rates and different starting configurations. In a last step,
we now discuss the influence of the mutation rate and the starting configuration
on the evolution dynamics. Consider first the time evolution of the equilibrium
distribution of genotypes $4^{-N}|\Omega\rangle$. Although no analytical results are available here,
we may give the following intuitive argument that there is a phase transition at finite
$t = t_c$ for any mutation rate below the critical equilibrium mutation rate $\mu_c$: Since
mutation alone tries to keep the population in the equilibrium distribution, the
evolution dynamics will be slowed down by mutation for small $t$. In particular,
mean fitness and surplus will remain zero on a finite interval at least up to the
threshold value of the corresponding model with vanishing mutation. On the other
hand, the limiting values of $w$ and $u$ are finite for $\mu < \mu_c$, giving rise to a non-analytical
point of $w(t)$ and $u(t)$ at some finite $t = t_c$. As shown in the upper graph
of Fig. 9, this behaviour is clearly visible in numerical results for finite sequence
sizes.

Figure 9: Time evolution of the variance of the fitness in the symmetric fitness
model with sequence length N = 60. Results are shown for varying mutation rates
and two different starting configurations.

In order to contrast the time evolution of the unstructured population with
an equidistribution of genotypes as starting configuration, we have also performed
calculations for the opposite case of a population with initially homogeneous phenotypes.
Here, at $t = 0$, any "individual" in the population has the same value
$s_i = 0$ for the three surplus components. The result (for finite sequence length
$N = 60$) is shown in the lower graph of Fig. 9. As for the equidistribution of
genotypes, there is a clear threshold effect in the time evolution for any finite value
$0 < \mu < \mu_c$ of the mutation rate. The transition appears to be particularly sharp
for small mutation rates. In contrast to the unstructured case, the critical waiting
time $t_c$ for the transition is no longer monotonically increasing with the mutation
rate $\mu$, but falls into two regimes: For mutation rates near the equilibrium
threshold value $\mu_c$, the situation is similar to the unstructured case: Here, single
mutants with higher fitness appear in the population after a short while. Due to
the continuing mutation pressure, however, a certain time is needed for these fitter
individuals to grow to a finite proportion and to dominate the mean values in the
infinite population. For small $\mu$, on the other hand, the critical waiting time $t_c$ is
dominated by the time needed for mutation to explore the configuration space and
to generate individuals with higher fitness at a sufficient rate.



5 Discussion 

When in 1^ a class of models for sequence space evolution was introduced, using 
the framework of Ising quantum chains, the calculations started with four major 
simplifications of the biological situation. These are the consideration of a two- 
state model, the assumption of an infinite sequence length, the use of simplistic 
fitness landscapes, and the restriction to infinite population sizes. In this paper, we
have looked at the first two of these simplifying assumptions. Finally, an extended 
discussion of the evolution dynamics of these models has also been presented. In 
the following paragraphs, we give a summary of our findings and an outlook on the 
remaining open problems. 



Two-state versus four-state models. The main concern of this contribution 
is the generalization of the modelling framework, introduced in to four states 
(corresponding to the four nucleotides) on each site. The generalization presented 
makes use of the $C_2 \times C_2$ symmetry inherent in the Kimura 3ST mutation scheme.
On the 'physical side' this leads to a model of two coupled Ising quantum chains 
(rather than to a four-state Potts model). Compared with the two-state model, 
the extension can be thought of as consisting of two steps. In a first step, we 
represent the four states on each site by the spin values of two spins in decoupled 
chains. Note that already in this simplified model three phases occur in the phase 
diagram since the transition lines of the two decoupled chains will not in general 
coincide. The second step consists of the introduction of a more realistic mutation 
scheme which also changes the configuration space topology and the corresponding 
use of a refined fitness landscape. Both these extensions lead to a coupling of 
the chains, and an even richer phase space structure is found, including first-order 
transitions. As may be seen from the introduction of a small linear field term 
into the fitness function in subsection 4.2, this change of the transition to first 
order leads to an increased robustness of the threshold phenomena with respect to 
symmetry-breaking perturbations.



Finite sequence length. Typical sequence lengths of enzymes or viruses are of 
the order 10^ - 10^. While these numbers are certainly far off the typical sizes of 
macroscopic systems in physics, they are, in principle, large enough to successfully 
suppress $1/N$-corrections. However, especially models with simple fitness landscapes
describe - at best - the evolution dynamics in a very restricted configuration space
of particularly 'important' sites, disregarding neutral or altogether lethal mutations.
In view of this fact, consideration of finite sequence lengths is indispensable and calculations
in the thermodynamic limit even seem to be questionable at first sight. In
order to clarify the usefulness of infinite-size methods in this context, we performed
a number of numerical calculations for finite sequence lengths. The results are quite
encouraging. As shown in subsection 4.3, the characteristic properties of the thermodynamic
limit are well visible even for tiny sequence sizes, such as $N = 10$, and
the approximation is already quantitatively reasonable for sequences of length 60.

The fitness landscape. The construction of a tractable fitness landscape which
nevertheless comprises the relevant biology is certainly the major task for all these 
models. In this contribution, in order to obtain at least some analytical results, 
we have chosen a fitness function from the smooth end of the landscape zoo. Due 
to its permutation invariance, the quadratic fitness function effectively disregards 
any local variance in the interaction between sites, but only considers the average 
epistatic effect. As such, it is in many respects certainly no more than a toy- 
model for evolution. However, the assumption of permutation invariance of the 
sites is quite common in evolutionary biology and comprises a large number of 
standard models for evolution, such as the quadratic optimum model or Eigen's 
original sharply peaked landscape. The results show that the essential structure 
responsible for characteristic effects such as the error threshold is already contained 
in this simplified framework and may also serve as a reference for future work on 
fitness functions with increased ruggedness, such as the NK-landscape hierarchy 
[l^ . Here, we expect the results for the quadratic fitness model to be qualitatively 
stable at least under certain forms of mild ruggedness, such as the introduction of 
site-randomness in the fields and interactions Q. Pronounced changes, on the other 
hand, should be expected when spin-glass effects come into play. 



Finite population size. In going from the deterministic limit to the evolution of 
finite populations, the ordinary differential equation (|^) has to be replaced by the 
master equation of a stochastic process which is no longer covered by the theoretical 
framework presented in this article. Due to the complexity of the stochastic equa- 
tions, analytical results seem to be out of reach at present for all but the simplest 
selection schemes. Monte-Carlo simulations, however, should be possible and could 
considerably add to theoretical insight here. 

Although the general picture of the deterministic case should persist at least 
for sufficiently large populations, the study of finite population effects is certainly 
of importance. For related models, such as the quasispecies model with the single 
peaked landscape, it has been found [ p^ that the deterministic results can be
interpreted as the time averages of the stochastic process for mutation rates outside 
a certain interval around an error transition. Directly at the threshold, however, 
large fluctuations and a jump in the long-time averages appear in the stochastic 
system at a critical mutation rate which seems to be lower by an amount roughly 
proportional to $1/\sqrt{N}$ in comparison with the deterministic case. Mainly because
of these expected finite population effects we have restricted discussions in this 
article entirely to the phase space structure of the models and the order of the 
phase transitions. Any further details of the transitions, even critical exponents, 
will presumably never be visible in real biological systems and thus seem to be of 
limited relevance in this context. 

Let us finally remark that, although biological populations are certainly finite, 
the consideration of the infinite population limit is not (only) a technical necessity, 
but also of direct importance for the study of the error threshold. That is so because 
this effect, in distinction to the phenomenon of Muller's ratchet, is by definition not
due to genetic drift, but solely due to the form of the fitness function. It thus
always has to be shown that the threshold effect persists even for infinitely large
population sizes.



Error threshold behaviour. Since there is more than one, and sometimes conflicting,
definition of the error threshold in the literature (cf. the discussion in Q), let
us start this paragraph with a few clarifying remarks. In this article, following Q,
we use the notion of the error threshold as equivalent to phase transitions. As such,
a clear-cut mathematical definition (as non-analytical points in the mean fitness) is
possible only in the infinite sites (or thermodynamic) limit. However, since the thermodynamic
limit can be considered as an excellent approximation already for rather
small systems, the infinite system property gives a valid explanation for prominent
features which are observable for finite sequences as well. In our study, we have
always considered sequences of a fixed length and have treated the mutation rate
per site as the variable driving the transition. In comparing systems of different
length, we have scaled the variables such that a well-defined limit is approached as
$N \to \infty$. In particular, the 'critical' mutation rate per site in a finite system quickly
converges to the limiting value $\tilde\mu_c$. Originally, the threshold has been viewed as a
limiting factor on the sequence length. This, however, should not be confusing:
We switch to this latter picture simply by letting the reduced mutation rate depend
linearly on the sequence length, $\tilde\mu \sim N$, and obtain a critical length $\sim \tilde\mu_c$ (for
sufficiently large sequences).

Our results on the error threshold phenomenon fit previous ones for the two-state
case and related models in that negative epistasis is needed to observe a transition
(cf. |2^, ^). Contrary to the two-state case, the threshold corresponds to a first-order
transition for certain parameter ranges and persists for a sufficiently small
linear part in the fitness function. Both the equilibrium and the dynamical phase
diagram of the transition-transversion model (with $\alpha_i = 0$) possess two ordered
phases, characterized by non-zero values of one or all three components of the surplus
order parameter, and the disordered phase with zero surplus where selection ceases
to operate. The threshold effect appears to be especially sharp in the evolution
dynamics, where a jump in the mean surplus and fitness and a delta singularity in
the variance of fitness occur.

Besides the threshold effect, however, other properties of mutation-selection 
models may be studied within the framework presented. After all, exclusive con- 
centration on phase transitions is perhaps too much a physicist's point of view on 
these systems. The relations between surplus, mutation rate and the variance of 
fitness (35), (36), for example, are valid for the entire time evolution and arbitrary
mutation rates. Depending on the fitness function applied, they may give rise to 
characteristic features also far off the transition point. This is particularly explicit 
for the equilibrium variance of fitness which runs through a pronounced maximum 
for fitness functions with negative epistasis at a mutation rate much smaller than 
the threshold value. 